Migrate to the reusable tox workflow #1102
Conversation
```yaml
  - "requirements/*/*.txt"
  - "pyproject.toml"
  - "toxfile.py"
```
I have been wondering, over the past week, whether `tox-uv`'s faster venv building makes it unnecessary to cache the `.tox` dir contents. As long as the `uv` action's cache is populated, `.tox/` can be quickly rebuilt.
One of the things I wonder is whether the balance between the two may, in fact, favor rebuilding over caching (since caching and hashing take some time).
I'm curious whether you've given this any thought?
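For context, the tradeoff being weighed here can be sketched as an explicit `.tox/` cache step versus simply rebuilding environments from `uv`'s package cache. A minimal, illustrative sketch of the caching approach, assuming `actions/cache` with a `hashFiles`-based key; this is not this workflow's actual configuration:

```yaml
# Illustrative only: cache the built tox environments directly.
# The key must change whenever the environments' contents would change.
- uses: actions/cache@v4
  with:
    path: .tox/
    key: tox-${{ runner.os }}-${{ hashFiles('tox.ini', 'pyproject.toml') }}
```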
I have. Caching the tarballs and wheels, instead of caching everything that was installed, hasn't previously been faster.
The numbers are borne out best on Windows, so I'll share from the feedparser logs; feedparser's CI tests the highest and lowest supported CPython versions (which I recommend doing here, but didn't introduce in this PR).
Here are the timings reported by the feedparser tests for Windows with a cache miss:
```
py3.9-chardet: OK (45.76=setup[7.22]+cmd[38.55] seconds)
py3.13-chardet: OK (41.80=setup[9.69]+cmd[32.11] seconds)
congratulations :) (87.68 seconds)
```
and for a cache hit:
```
py3.9-chardet: OK (42.22=setup[3.74]+cmd[38.48] seconds)
py3.13-chardet: OK (31.66=setup[0.21]+cmd[31.45] seconds)
congratulations :) (74.01 seconds)
```
(Note that the first tox environment always has the wheel build step counted as part of its `setup`.) Since the `cmd` times per tox environment are within ~0.5s of each other between the cache-miss and cache-hit executions, I'm more inclined to trust that the `setup` times aren't simply GitHub runner jitter.
So, my interpretation is that this is a win of ~13 seconds across 2 tox environments on Windows (the `setup` savings are 7.22 - 3.74 ≈ 3.5s and 9.69 - 0.21 ≈ 9.5s).
It took 1 second to look up the cache and miss, and then 5 seconds to upload the cache from the cache-miss job; it subsequently took 2 seconds to download the cache for the cache-hit job. That's 6 seconds of overhead on the miss path vs. 2 seconds on the hit path, so an additional ~4 seconds won.
I have consistently found that it's faster to cache what's installed, rather than caching what needs to be installed. `tox-uv` makes environment creation and package installation fast, but I don't think it's fast enough.
You're welcome to try improving on this! It's mechanically trivial, but extremely time-consuming. Here are the steps (a sketch of the second step follows this list):
- Create a branch off this project (or my own workflow repo).
- Point a second project with a "significant" test suite at the new branch.
- Repeatedly push and force-push to the second project, possibly manually deleting the caches, and keep switching back to the workflow project to make and push changes.
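Pointing a second project at the branch amounts to referencing the experimental branch from that project's caller workflow. A hypothetical example, assuming the reusable workflow lives at `.github/workflows/tox.yaml` in a repo named `example-user/workflow-repo` (both names are placeholders):

```yaml
# Hypothetical caller workflow in the second project, pinned to the
# experimental branch rather than a released tag.
jobs:
  test:
    uses: example-user/workflow-repo/.github/workflows/tox.yaml@experiment-branch
```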
I find this explanation 110% satisfactory. I'm probably not going to experiment with this, at least within the next few days: my main question was about the comparison between `time(cache miss + tox_uv setup + cache save)` vs `time(cache hit + tox_uv setup)`, and you've already provided numbers for that.
I am willing to accept some minor regressions in CI speeds if it gives us other improvements (e.g., workflow simplicity). In particular, there's something I've been trying to track in the PRs as you've converted us over to the new workflow: what exactly is being used for cache keys, and is it "correct"?
The `uv` action cache carries all of the raw packages already (in the runner's homedir), so there's some interesting interplay there with the `.tox` dir.
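(For readers unfamiliar with it: the `uv` action referred to here is `astral-sh/setup-uv`, whose built-in caching stores uv's package cache under the runner's home directory when opted in. A minimal sketch:)

```yaml
# Sketch: enabling the uv action's built-in cache of downloaded packages.
- uses: astral-sh/setup-uv@v5
  with:
    enable-cache: true
```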
Thanks for laying this all out for me!
Oh, I think I see what you're referring to. This isn't using the `uv` GitHub action, so there's no side caching happening, and pip caching isn't enabled for the `setup-python` action.
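For contrast, pip caching in `setup-python` is opt-in via its `cache` input; a minimal sketch of what enabling it would look like (the workflow described here deliberately does not do this):

```yaml
# Not used by this workflow: setup-python's opt-in pip cache.
- uses: actions/setup-python@v5
  with:
    python-version: "3.13"
    cache: pip
```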
For cache keys, here's the rule that I've generally been following:
**Already-included files**

These files are always included by the reusable workflow:

- `.python-identifiers` (generated by the `kurtmckee/detect-pythons` action; ensures that the cache, which contains symlinks to Python interpreter executables, is invalidated if the Python versions change)
- `.workflow-config.json` (ensures that changes to the requested configuration invalidate the cache)
- `tox.ini` (ensures that changes to the tox configuration invalidate the cache)
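A sketch of how those always-included files might feed into the cache key (illustrative; the reusable workflow's exact expression may differ):

```yaml
# Illustrative: the hash of the always-included files, combined with any
# caller-supplied patterns, would form the cache key.
key: cache-${{ hashFiles('.python-identifiers', '.workflow-config.json', 'tox.ini') }}
```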
**Files you should use with `cache-key-hash-files`**

In general, any files that contain tool configuration directives should be hashed for cache-busting, for example:

- `pyproject.toml`
- `mypy.ini`
- `.flake8`
- `.pre-commit-config.yaml`
- `setup.cfg`
- `requirements/*/*.txt`
- `poetry.lock`
If these files change, it can indicate that different dependencies should be installed, or that a tool like mypy should change how it writes its own cache, or any number of other things that might make the workflow cache less useful.
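Putting it together, a hypothetical caller might pass these patterns to the reusable workflow via `cache-key-hash-files`, matching the format visible in this PR's diff (the repo path and version tag below are placeholders):

```yaml
# Hypothetical caller: hash the files that affect tool behavior and
# dependency resolution in this repo.
jobs:
  tox:
    uses: example-user/workflow-repo/.github/workflows/tox.yaml@v1
    with:
      cache-key-hash-files: |
        - "pyproject.toml"
        - "requirements/*/*.txt"
```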
This PR migrates the SDK to the reusable tox workflow.
📚 Documentation preview 📚: https://globus-sdk-python--1102.org.readthedocs.build/en/1102/